
    Robust Model Compression Using Deep Hypotheses

    Machine Learning models should ideally be compact and robust. Compactness provides efficiency and comprehensibility, whereas robustness provides resilience. Both topics have been studied in recent years, but in isolation. Here we present a robust model compression scheme that is independent of model type: it can compress ensembles, neural networks and other kinds of models into diverse types of small models. The main building block is the notion of depth, derived from robust statistics. Originally, depth was introduced as a measure of the centrality of a point in a sample, such that the median is the deepest point. This concept was extended to classification functions, making it possible to define the depth of a hypothesis and the median hypothesis. Algorithms have been suggested to approximate the median, but they have been limited to binary classification. In this study, we present a new algorithm, Multiclass Empirical Median Optimization (MEMO), which finds a deep hypothesis in multiclass tasks, and prove its correctness. This leads to our Compact Robust Estimated Median Belief Optimization (CREMBO) algorithm for robust model compression. We demonstrate the success of this algorithm empirically by compressing neural networks and random forests into small decision trees, which are interpretable models, and show that they are more accurate and robust than those produced by comparable methods. In addition, our empirical study shows that our method outperforms Knowledge Distillation on DNN-to-DNN compression.
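
    The abstract gives only the intuition behind depth (the median is the deepest, most central point, generalized to classification functions), so the following is a minimal toy sketch of that intuition rather than MEMO or CREMBO themselves; the worst-case-agreement notion of depth and the names hypothesis_depth and empirical_median_hypothesis are illustrative assumptions, not the paper's definitions.

        import numpy as np

        def hypothesis_depth(preds_h, preds_pool):
            # preds_h:    (n_samples,) labels predicted by the candidate hypothesis
            # preds_pool: (n_hypotheses, n_samples) labels predicted by a reference pool
            # One simple multiclass depth: at each sample point, measure how many
            # pool hypotheses agree with the candidate, and take the worst case.
            agreement = (preds_pool == preds_h).mean(axis=0)  # per-point agreement rate
            return agreement.min()

        def empirical_median_hypothesis(preds_pool):
            # The empirical median hypothesis is the deepest member of the pool.
            depths = [hypothesis_depth(preds_pool[i], preds_pool)
                      for i in range(len(preds_pool))]
            return int(np.argmax(depths))

        # Toy usage: 5 hypotheses evaluated on 8 points with 3 classes.
        rng = np.random.default_rng(0)
        pool = rng.integers(0, 3, size=(5, 8))
        print("median hypothesis index:", empirical_median_hypothesis(pool))

    Under this toy definition, the deepest hypothesis is one that no single sample point easily separates from the rest of the pool, mirroring how the median resists outliers.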

    Graph Trees with Attention

    When dealing with tabular data, models based on regression and decision trees are a popular choice due to the high accuracy they provide on such tasks and their ease of application compared to other model classes. Yet, when it comes to graph-structured data, current tree learning algorithms provide no tools for exploiting the structure of the data beyond feature engineering. In this work we address this gap and introduce Graph Trees with Attention (GTA), a new family of tree-based learning algorithms designed to operate on graphs. GTA leverages both the graph structure and the features at the vertices, and employs an attention mechanism that allows decisions to concentrate on sub-structures of the graph. We analyze GTA models and show that they are strictly more expressive than plain decision trees. We also demonstrate the benefits of GTA empirically on multiple graph and node prediction benchmarks. In these experiments, GTA always outperformed other tree-based models and often outperformed other types of graph-learning algorithms such as Graph Neural Networks (GNNs) and Graph Kernels. Finally, we provide an explainability mechanism for GTA and demonstrate that it can produce intuitive explanations.
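
    The abstract does not specify the form of GTA's attention, so the sketch below only illustrates the general idea of a tree split that attends over a vertex's neighborhood; attention_split, the bilinear scoring matrix w_att, and the softmax pooling are hypothetical choices, not the paper's mechanism.

        import numpy as np

        def attention_split(x_v, neighbor_feats, w_att, feature_idx, threshold):
            # x_v:            (d,) features of the vertex being routed
            # neighbor_feats: (n_neighbors, d) features of its graph neighbors
            # w_att:          (d, d) hypothetical bilinear attention parameters
            # Attention logits score each neighbor's relevance to the vertex.
            logits = neighbor_feats @ w_att @ x_v            # (n_neighbors,)
            weights = np.exp(logits - logits.max())
            weights /= weights.sum()                         # softmax over neighbors
            aggregate = weights @ neighbor_feats             # attention-weighted pooling
            # The split tests the pooled feature, so the routing decision can
            # concentrate on the sub-structure the attention singles out.
            return aggregate[feature_idx] <= threshold

        # Toy usage: route one vertex with 4 neighbors and 6 features.
        rng = np.random.default_rng(1)
        x_v = rng.normal(size=6)
        nbrs = rng.normal(size=(4, 6))
        w = rng.normal(size=(6, 6))
        print("go left:", attention_split(x_v, nbrs, w, feature_idx=2, threshold=0.0))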

    Large Scale and Streaming Time Series Segmentation and Piece-Wise Approximation: Extended Version

    Segmenting a time series, or approximating it with a piecewise linear function, is often needed when handling data in the time domain, for example to detect outliers, clean data, or detect events. The data ranges from ECG signals and traffic monitors to stock prices and sensor networks. Modern datasets of this type are large, and in many cases infinite in the sense that the data is a stream rather than a finite sample; therefore, to segment it, an algorithm has to scale gracefully with the size of the data. Dynamic Programming (DP) can find the optimal segmentation, but the DP approach has a complexity of O(T^2) and thus can handle neither datasets with millions of elements nor streaming data, so various heuristics are used in practice. This study shows that if the approximation measure has an inverse triangle inequality property (ITIP), the solution of the dynamic program can be computed in linear time, and streaming data can be handled as well. The ITIP is shown to hold in many cases of interest. The speedup due to the new algorithms is evaluated on a variety of datasets and ranges from 8x to 8200x over the DP solution, without sacrificing accuracy. Confidence intervals for segmentations are derived as well.
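
    For concreteness, here is a minimal sketch of the quadratic DP baseline the paper speeds up, specialized (as an assumption) to piecewise-constant segments under squared error; the linear-time ITIP-based algorithm itself is not reproduced here.

        import numpy as np

        def dp_segmentation(x, k):
            # Classic O(k * T^2) dynamic program for the optimal piecewise-constant
            # approximation of x with k segments under squared error.
            x = np.asarray(x, dtype=float)
            T = len(x)
            s1 = np.concatenate([[0.0], np.cumsum(x)])       # prefix sums
            s2 = np.concatenate([[0.0], np.cumsum(x ** 2)])  # prefix sums of squares

            def cost(i, j):
                # Squared error of approximating x[i:j] by its mean, in O(1).
                total = s1[j] - s1[i]
                return (s2[j] - s2[i]) - total * total / (j - i)

            INF = float("inf")
            D = np.full((k + 1, T + 1), INF)   # D[m, j]: best error for x[:j], m segments
            back = np.zeros((k + 1, T + 1), dtype=int)
            D[0, 0] = 0.0
            for m in range(1, k + 1):
                for j in range(m, T + 1):
                    for i in range(m - 1, j):
                        c = D[m - 1, i] + cost(i, j)
                        if c < D[m, j]:
                            D[m, j], back[m, j] = c, i
            bounds, j = [T], T                  # recover boundaries by backtracking
            for m in range(k, 0, -1):
                j = back[m, j]
                bounds.append(j)
            return bounds[::-1], D[k, T]

        x = [1, 1, 1, 5, 5, 5, 2, 2]
        print(dp_segmentation(x, 3))  # -> ([0, 3, 6, 8], 0.0)

    The triple loop over segment count, right endpoint, and split point is exactly what makes the exact solution quadratic in T per segment, which is the cost the ITIP property removes.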

    On Lightweight Privacy-Preserving Collaborative Learning for IoT Objects

    The Internet of Things (IoT) will be a major data generation infrastructure for achieving better system intelligence. This paper considers the design and implementation of a practical privacy-preserving collaborative learning scheme, in which a curious learning coordinator trains a better machine learning model based on the data samples contributed by a number of IoT objects, while the confidentiality of the raw training data is protected from the coordinator. Existing distributed machine learning and data encryption approaches incur significant computation and communication overhead, rendering them ill-suited for resource-constrained IoT objects. We study an approach that applies an independent Gaussian random projection at each IoT object to obfuscate the data, and trains a deep neural network at the coordinator based on the projected data from the IoT objects. This approach introduces light computation overhead on the IoT objects and moves most of the workload to the coordinator, which can have sufficient computing resources. Although the independent projections performed by the IoT objects address potential collusion between the curious coordinator and compromised IoT objects, they significantly increase the complexity of the projected data. We leverage the superior capability of deep learning in capturing sophisticated patterns to maintain good learning performance. Extensive comparative evaluation shows that this approach outperforms other lightweight approaches that apply additive noisification for differential privacy and/or support vector machines for learning, in applications with light data pattern complexity.
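
    A minimal sketch of the projection step as described: each object draws its own Gaussian matrix and ships only projected samples to the coordinator. The dimensions d and p, the 1/sqrt(p) scaling, and the helper names gaussian_projection and obfuscate are assumptions for illustration, not the paper's parameters.

        import numpy as np

        d, p = 64, 32  # hypothetical raw and projected dimensions

        def gaussian_projection(seed, d, p):
            # Each IoT object draws its own private Gaussian projection matrix;
            # the matrices are independent across objects, which is what counters
            # collusion between the coordinator and a compromised object.
            rng = np.random.default_rng(seed)
            return rng.normal(0.0, 1.0 / np.sqrt(p), size=(p, d))

        def obfuscate(R, X):
            # Only the projected samples leave the object; R and the raw X stay local.
            return X @ R.T  # (n, d) -> (n, p)

        # Toy usage: three objects project their local samples; the coordinator
        # would then train a deep neural network on the projected data.
        rng = np.random.default_rng(7)
        local_data = [rng.normal(size=(10, d)) for _ in range(3)]
        projected = [obfuscate(gaussian_projection(seed, d, p), X)
                     for seed, X in enumerate(local_data)]
        print([Z.shape for Z in projected])  # [(10, 32), (10, 32), (10, 32)]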